Abstract: Few-shot point cloud semantic segmentation struggles with complex structures, ambiguous semantics, and noise interference due to its limitations in global context modeling, feature alignment, and semantic guidance. To address these issues, a multi-level global-aware model for few-shot 3D point cloud semantic segmentation (MGAM) is proposed. Building on multi-level window partitioning, the receptive field is progressively expanded to jointly model local geometry and global semantics. A dual-domain attention fusion module (DAFM) integrates channel attention and point-wise attention to fuse local and global information. A global auxiliary point mechanism (GAPM) embeds learnable global points into key layers to enhance cross-layer feature propagation. A global category-aware loss (GCAL) adds category-distribution constraints to point-level supervision to highlight small-class targets. Ablation experiments verify the effectiveness and complementarity of each module, and comparative experiments show the superior performance of MGAM, particularly in complex scenarios.
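The global category-aware loss described above combines point-level supervision with a scene-level category-distribution constraint. The paper does not give its exact formulation here, so the following is a minimal hypothetical sketch: per-point cross-entropy plus a KL-divergence term that aligns the averaged predicted class distribution with the label distribution, which keeps small classes from being drowned out. The function name, the weighting factor `lam`, and the specific KL form are illustrative assumptions, not the authors' definition.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_category_aware_loss(logits, labels, num_classes, lam=0.5):
    """Sketch of a category-aware loss: point-wise cross-entropy plus a
    penalty aligning the scene-level predicted class distribution with
    the label distribution, so small classes still contribute.

    logits: (N, C) per-point class scores; labels: (N,) int class ids.
    """
    probs = softmax(logits, axis=1)                               # (N, C)
    # Point-level supervision: mean cross-entropy over all points.
    ce = -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()
    # Global category distributions of prediction and ground truth.
    pred_dist = probs.mean(axis=0)                                # (C,)
    true_dist = np.bincount(labels, minlength=num_classes) / len(labels)
    # KL(true || pred): every class present in the labels, however
    # small, contributes a nonzero term to the penalty.
    mask = true_dist > 0
    kl = np.sum(true_dist[mask]
                * np.log(true_dist[mask] / (pred_dist[mask] + 1e-12)))
    return ce + lam * kl
```

Under this sketch, a prediction that is point-wise accurate but collapses rare categories is still penalized through the distribution term, which is the stated motivation for highlighting small-class targets.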
YAN Weishu, YUAN Jian, YANG Mingrui, XU Jiahui. Multi-level Global-Aware Model for Few-Shot 3D Point Cloud Semantic Segmentation[J]. Pattern Recognition and Artificial Intelligence, 2026, 39(3): 225-238.